Spanish Synthesis Corpora
Abstract
This paper describes the design of a synthesis database for a high-quality corpus-based speech synthesis system in Spanish. The database has been designed for speech synthesis, speech conversion and expressive speech. The design follows the specifications of the TC-STAR project and has also been applied to collect equivalent English and Mandarin synthesis databases. The corpus sentences were selected mainly from transcribed speech and novels, using phonetic and prosodic coverage as the selection criterion. The corpus was then completed with sentences specifically designed to cover frequent phrases and words. Two baseline speakers and four bilingual speakers were recorded. The recordings consist of 10 hours of speech for each baseline speaker and one hour of speech for each bilingual voice-conversion speaker. The database is labelled and segmented. Pitch marking and phonetic segmentation were done automatically, and up to 50% was manually supervised. The database will be available at ELRA.
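The abstract does not say which algorithm was used to select sentences for coverage. As an illustration only, coverage-driven selection is often approximated with a greedy procedure that repeatedly picks the sentence adding the most units not yet covered. The sketch below assumes that approach; the function names `units` and `greedy_select` are hypothetical, and character bigrams stand in for the phonetic and prosodic units a real system would obtain from a phonetic transcription.

```python
def units(sentence: str) -> set[str]:
    """Hypothetical stand-in for phonetic units: character bigrams.

    A real system would use diphones or similar units derived from a
    phonetic transcription, possibly tagged with prosodic context.
    """
    text = "".join(ch for ch in sentence.lower() if ch.isalpha() or ch == " ")
    return {text[i:i + 2] for i in range(len(text) - 1)}


def greedy_select(sentences: list[str], budget: int) -> list[str]:
    """Pick up to `budget` sentences, each time taking the one that
    contributes the largest number of still-uncovered units."""
    covered: set[str] = set()
    chosen: list[str] = []
    remaining = list(sentences)
    while remaining and len(chosen) < budget:
        # Sentence with the highest marginal coverage gain.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:
            break  # nothing left adds new units
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    candidates = ["aa", "abcd", "ab"]
    print(greedy_select(candidates, budget=2))
```

The loop stops early once no remaining sentence adds a new unit, so the result can be shorter than the budget.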
Similar resources
Corpora of Latin American Spanish for research in prosody and synthesis
The present article describes the creation, labelling and main characteristics of a corpus of spoken Latin American Spanish. The corpus was collected with several objectives in mind: a) to fulfill our own research needs in the study of Latin American Spanish prosodic phenomena, where the absence of available corpora has already been noticed [1][6], b) to be able to experiment with prosodic mode...
Spontaneous Speech Corpora for language learners of Spanish, Chinese and Japanese
This paper presents a method for designing, compiling and annotating corpora intended for language learners. In particular, we focus on spoken corpora to be used as complementary material in the classroom as well as in examinations. We describe the three corpora (Spanish, Chinese and Japanese) compiled by the Laboratorio de Lingüística Informática at the Autonomous University of Madrid (LLI...
Bilingual aligned corpora for speech to speech translation for Spanish, English and Catalan
In the framework of the EU-funded Project LC-STAR, a set of Language Resources (LR) for all the Speech to Speech Translation components (Speech recognition, Machine Translation and Speech Synthesis) was developed. This paper deals with the development of bilingual corpora in Spanish, US English and Catalan. The corpora were obtained from spontaneous dialogues in one of these three languages whi...
Grapheme-To-Phoneme Transcription Rules For Spanish, With Application To Automatic Speech Recognition And Synthesis
Large phonetic corpora including both standard and variant transcriptions are available for many languages. However, applications requiring the use of dynamic vocabularies make it necessary to transcribe words not present in the dictionary. Also, additional alternative pronunciations to standard forms have been shown to improve recognition accuracy. Therefore, new techniques to automatically generate v...
Evaluation of Corpus Assisted Spanish Learning
In the development of corpus linguistics, the creation of corpora has had a critical role in corpus-based studies. The majority of created corpora have been associated with English and native languages, while other languages and types of corpora have received relatively less attention. Because an increasing number of corpora have been constructed, and each corpus is constructed for a definite p...
Automatic prediction of emotions from text in Spanish for expressive speech synthesis in the chat domain / Predicción automática de emociones a partir de texto en español para síntesis de voz expresiva en el dominio del chat
This paper describes a module for the prediction of emotions in text chats in Spanish, oriented to its use in specific-domain text-to-speech systems. A general overview of the system is given, and the results of some evaluations carried out with two corpora of real chat messages are described. These results seem to indicate that this system offers a performance similar to other systems describe...
Publication date: 2006